Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc] Enhance debugging output for amalloc and dmalloc #2144

Merged
merged 4 commits into from
Dec 19, 2024
Merged

Conversation

ghaerr
Copy link
Owner

@ghaerr ghaerr commented Dec 19, 2024

This PR allows for better understanding of debug output from the arena and debug malloc allocators, and new debug output was added to fmemalloc.

Now, when running sysctl malloc.debug=2 before the toolchain, a one-line summary is displayed when malloc, free or fmemalloc is called. This information has been used to tune the 8086 toolchain programs.

For instance, the following is output when CPP86 is run:

(13)FMEMALLOC(16000) total 16000, maxsize 16000
(13)malloc(   17) alloc    26, free 15974, arena    26/16000
(13)malloc( 1024) alloc  1052, free 14948, arena  1052/16000
(13)malloc(   19) alloc  1074, free 14926, arena  1074/16000
(13)malloc(   17) alloc  1094, free 14906, arena  1094/16000
(13)malloc(   20) alloc  1116, free 14884, arena  1116/16000
(13)malloc(   36) alloc  1154, free 14846, arena  1154/16000
(13)malloc( 1024) alloc  2180, free 13820, arena  2180/16000
(13)malloc(   26) alloc  2208, free 13792, arena  2208/16000
(13)malloc(   20) alloc  2230, free 13770, arena  2230/16000
(13)malloc(   29) alloc  2262, free 13738, arena  2262/16000
(13)malloc(   20) alloc  2284, free 13716, arena  2284/16000
(13)malloc(   36) alloc  2322, free 13678, arena  2322/16000
(13)malloc( 1024) alloc  3348, free 12652, arena  3348/16000
(13)  free( 1024) alloc  2322, free 13678, arena  3348/16000
(13)  free(   36) alloc  2284, free 13716, arena  3348/16000

This shows the single fmemalloc for the arena malloc init, followed by each of the malloc/free's that CPP86 performs. It also shows the maximum use of the arena heap is 3348 bytes out of 16000. (I have lowered this from 64K significantly as a result). CPP86 does not use any fmemalloc calls from the arena malloc wrapper.

Here is the result from running C86:

(15)FMEMALLOC(16000) total 16000, maxsize 16000
(15)malloc(   36) alloc    44, free 15956, arena    44/16000
(15)malloc( 1024) alloc  1070, free 14930, arena  1070/16000
(15)malloc(   36) alloc  1108, free 14892, arena  1108/16000
(15)malloc( 1024) alloc  2134, free 13866, arena  2134/16000
(15)FMEMALLOC( 3076) total 19076, maxsize 16000
(15)malloc(  640) alloc  2776, free 13224, arena  2776/16000
(15)malloc(  512) alloc  3290, free 12710, arena  3290/16000
(15)FMEMALLOC( 3076) total 22152, maxsize 16000
(15) fmemfree()
(15)  free(  512) alloc  2776, free 13224, arena  3290/16000
(15)  free(  640) alloc  2134, free 13866, arena  3290/16000
(15) fmemfree()

Note only 3290 bytes of 16000 from arena heap, and two separate fmemallocs of 3076 bytes.

NASM86, on the other hand, ends up making thousands of malloc/free requests, all small, and can run in as little as 8K arena heap. However, it also makes tons of fmemalloc calls of large sizes. Nonetheless, I have used this information to reduce the memory requirements of NASM86 to an 8K arena with a 64 byte arena/main memory threshold.

I have used the results of this testing to lower the memory requirements of the 8086 toolchain. I will be submitting a PR there shortly.

Tested on QEMU making test.c. Not tested with chess.c - it will be interesting to see how much more memory is used.

@ghaerr ghaerr merged commit eae676f into master Dec 19, 2024
2 checks passed
@ghaerr ghaerr deleted the debugmalloc branch December 19, 2024 05:10
@rafael2k
Copy link
Contributor

rafael2k commented Dec 19, 2024

btw, sysctl malloc.debug=2 is nice!
but freezes the system at umount.
writes:
START
and freezes

[EDIT]: I forgot to recompile after this commit - doing it now

@ghaerr
Copy link
Owner Author

ghaerr commented Dec 19, 2024

sysctl malloc.debug=2 is nice!
but freezes the system at umount.

I forgot to recompile after this commit - doing it now

Ok. Does setting sysctl still crash the system? If so, does this happen after any particular toolchain program, or just one of them?

@rafael2k
Copy link
Contributor

rafael2k commented Dec 19, 2024

I'm not sure what I did, but after system compilation, no system crash.

[EDIT: managed to reproduce. Just start compilation, ctl+c and umount /mnt]

@rafael2k
Copy link
Contributor

image

@ghaerr
Copy link
Owner Author

ghaerr commented Dec 19, 2024

managed to reproduce. Just start compilation, ctl+c and umount /mnt]

We're probably experiencing memory corruption. "START" is the first thing the kernel displays, that was put in there to show the kernel is restarting, which should never happen and is usually the result of a CALL 0 or something like that.

To debug this, we can either 1) increase memory arena to see if that changes anything, 2) don't do ^C to see if the problem has to do only with signal handling (this is a definite possibility, perhaps the signal handling in make86 is causing corruption, and 3) see whether this happens using a shall script like ecc running the compilations rather than make.

We also need to try to find which tool is causing the corruption. MAKE is not running OWC so its not running the new arena code. I will put together a wrapper for it that will cause it to run the debug v7 allocator __dmalloc for IA16. That will allow us to see the memory it is using. Not sure what the problem is yet.

It is nice you have a repeatable test case. Umount doesn't do anything special, that's probably just the area of code that got corrupted that ends up causing the kernel crash.

@ghaerr
Copy link
Owner Author

ghaerr commented Dec 19, 2024

A potential issue with large model is that all pointers are far and can point to anything. So a bad pointer can easily scribble onto kernel code or data. Thus I have been working on even tighter restrictions and better checking for our memory management routines, but any internal bug in the program itself can also show up as a kernel crash.

@rafael2k
Copy link
Contributor

rafael2k commented Dec 19, 2024

I can confirm it has nothing to do with the printfs. After I Ctrl+C nasm and try to umount the floppy, I see the same crash.

[EDIT: Ctrl+C was not needed]

@rafael2k
Copy link
Contributor

rafael2k commented Dec 19, 2024

So, it was nasm destroying the memory. Fixed:
rafael2k/8086-toolchain@8e8d445

@ghaerr
Copy link
Owner Author

ghaerr commented Dec 19, 2024

[EDIT: Ctrl+C was not needed]

That's very good to know, I'll cross off worrying about possible undefined behavior occurring during a signal callback.

So, it was nasm destroying the memory. Fixed:

Well, I am glad you have this working again at 32K heap rather than 24K. But IMO, this isn't likely to fix the problem, as it seems the problem is a stray pointer, very possibly and most probably a bug in NASM. Changing the max heap just shifts around pointer values, so it could very easily be that NASM is now just writing somewhere else in memory that we can't see.

Of course, it is possible that arena malloc is the culprit, but CPP and C86 are running fine, and arena malloc was meticulously crafted from the already working V7 malloc.

Until we can get the crash to repeat in NASM on host, it will likely be hard to debug. However, after @toncho11's testing on real 8088 hardware, it's becoming pretty obvious that NASM isn't ever going to work as our core assembler - its way too big, way too slow, and possibly buggy. I think we should try to find another assembler that takes very similar to NASM-style (or close, possibly MASM) assembler input and switch completely to that. I plan on making C86 work with dual ASM output, both NASM and AS86 format, so that we can in the interim switch to AS86 for the devkit. I hope to get that working while you're on holiday.

After seeing the results of NASM on both your system and @toncho11's, getting completely rid of NASM, the sooner the better! The memory overwrite problem could easily come back at us at any time, and our entire toolchain would be in trouble. Of course, it just isn't fast enough on any real hardware anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants